AITopics | semi-markov decision process


semi-markov decision process



Model-based Reinforcement Learning for Semi-Markov Decision Processes with Neural ODEs

Neural Information Processing Systems, Dec-24-2025, 19:37:53 GMT

We present two elegant solutions for modeling continuous-time dynamics, in a novel model-based reinforcement learning (RL) framework for semi-Markov decision processes (SMDPs), using neural ordinary differential equations (ODEs). Our models accurately characterize continuous-time dynamics and enable us to develop high-performing policies using a small amount of data. We also develop a model-based approach for optimizing time schedules to reduce interaction rates with the environment while maintaining near-optimal performance, which is not possible for model-free methods. We experimentally demonstrate the efficacy of our methods across various continuous-time domains.
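To make the idea of an ODE-based SMDP dynamics model concrete, here is a minimal sketch, not the authors' implementation. It assumes a learned vector field f(s, a) that is integrated over the sojourn time tau to predict the state at the next decision point. The names `rk4_step`, `predict_next_state`, and `toy_field` are hypothetical, and `toy_field` is a hand-written stand-in for a neural network.

```python
import math
from typing import Callable, List

Vector = List[float]
Field = Callable[[Vector, float], Vector]


def rk4_step(f: Field, s: Vector, a: float, h: float) -> Vector:
    """Advance the state by one classical Runge-Kutta step of size h."""
    k1 = f(s, a)
    k2 = f([x + 0.5 * h * k for x, k in zip(s, k1)], a)
    k3 = f([x + 0.5 * h * k for x, k in zip(s, k2)], a)
    k4 = f([x + h * k for x, k in zip(s, k3)], a)
    return [
        x + h / 6.0 * (p + 2.0 * q + 2.0 * r + w)
        for x, p, q, r, w in zip(s, k1, k2, k3, k4)
    ]


def predict_next_state(
    f: Field, s: Vector, a: float, tau: float, n_steps: int = 100
) -> Vector:
    """Integrate ds/dt = f(s, a) for a sojourn time tau with the action held fixed."""
    h = tau / n_steps
    for _ in range(n_steps):
        s = rk4_step(f, s, a, h)
    return s


def toy_field(s: Vector, a: float) -> Vector:
    """Stand-in for a learned model (assumption): each component relaxes toward a."""
    return [a - x for x in s]


if __name__ == "__main__":
    # For this toy field the exact solution is a + (s0 - a) * exp(-tau).
    predicted = predict_next_state(toy_field, [1.0], a=0.0, tau=2.0)
    print(predicted[0], math.exp(-2.0))
```

Because tau is an explicit input, the same model can be queried at irregular decision times, which is what separates the semi-Markov setting from a fixed-step discretization.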

model-based reinforcement learning, name change, semi-markov decision process, (3 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.33)


Average-reward reinforcement learning in semi-Markov decision processes via relative value iteration

Yu, Huizhen, Wan, Yi, Sutton, Richard S.

arXiv.org Artificial Intelligence, Dec-9-2025

This paper applies the authors' recent results on asynchronous stochastic approximation (SA) in the Borkar-Meyn framework to reinforcement learning in average-reward semi-Markov decision processes (SMDPs). We establish the convergence of an asynchronous SA analogue of Schweitzer's classical relative value iteration algorithm, RVI Q-learning, for finite-space, weakly communicating SMDPs. In particular, we show that the algorithm converges almost surely to a compact, connected subset of solutions to the average-reward optimality equation, with convergence to a unique, sample path-dependent solution under additional stepsize and asynchrony conditions. Moreover, to make full use of the SA framework, we introduce new monotonicity conditions for estimating the optimal reward rate in RVI Q-learning. These conditions substantially expand the previously considered algorithmic framework and are addressed through novel arguments in the stability and convergence analysis of RVI Q-learning.
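For orientation, a single tabular update in the spirit of average-reward Q-learning for SMDPs can be sketched as below. This is a simplified illustration under stated assumptions, not the algorithm analyzed in the paper. It uses the sampled holding time `tau` directly and takes the reward-rate estimate f(Q) to be one fixed reference entry of the table; the paper's conditions on f, on holding-time estimates, and on stepsizes may differ. The function name `smdp_rvi_q_update` and all parameter names are hypothetical.

```python
from typing import Dict, Hashable, Sequence, Tuple

StateAction = Tuple[Hashable, Hashable]
QTable = Dict[StateAction, float]


def smdp_rvi_q_update(
    q: QTable,
    s: Hashable,
    a: Hashable,
    reward: float,
    tau: float,
    s_next: Hashable,
    actions: Sequence[Hashable],
    ref: StateAction,
    alpha: float,
) -> float:
    """Apply one asynchronous update to q[(s, a)] and return the TD error.

    reward is the total reward accumulated over the transition and tau is
    its holding time, so rate * tau is the reward expected at the current
    rate estimate over that duration.
    """
    rate = q[ref]  # assumption: f(Q) is a single reference entry
    best_next = max(q[(s_next, b)] for b in actions)
    td_error = reward - rate * tau + best_next - q[(s, a)]
    q[(s, a)] += alpha * td_error
    return td_error


if __name__ == "__main__":
    q = {(0, 0): 1.0, (0, 1): 0.0, (1, 0): 2.0, (1, 1): 3.0}
    td = smdp_rvi_q_update(
        q, s=0, a=1, reward=5.0, tau=2.0, s_next=1,
        actions=[0, 1], ref=(0, 0), alpha=0.5,
    )
    print(td, q[(0, 1)])
```

Only the visited state-action pair changes on each call, which is the asynchronous aspect the abstract refers to.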

assum, machine learning, reinforcement learning, (15 more...)

arXiv.org Artificial Intelligence

2512.06218

Country: North America > Canada (0.28)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.84)


Review for NeurIPS paper: Model-based Reinforcement Learning for Semi-Markov Decision Processes with Neural ODEs

Neural Information Processing Systems, Feb-7-2025, 12:45:32 GMT

Summary and Contributions: The paper proposes a method that uses ODEs to represent dynamics in continuous-time decision-making problems, with the aim of … The authors also aim to fill a perceived gap in the deep RL literature on continuous-time problems, where most publications are model-free and discretize time when it is continuous. They claim that their approach reduces dependence on vast amounts of training data, improves performance, and is well-founded as a model-based approach. I tend to agree, although this is not exactly my area. I also believe that connecting ODEs and other explicit models with RL is critical for extending RL methods to important problems in physics, chemistry, epidemiology and population modelling.

model-based reinforcement learning, neurips paper, semi-markov decision process, (1 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.40)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.40)


Review for NeurIPS paper: Model-based Reinforcement Learning for Semi-Markov Decision Processes with Neural ODEs

Neural Information Processing Systems, Feb-7-2025, 12:45:25 GMT

The reviewers all found the paper to make a reasonable contribution. While there are some limitations that were pointed out, the rebuttal addressed them well and the experiments were appreciated.

model-based reinforcement learning, neurips paper, semi-markov decision process, (1 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.40)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.40)


Increasing Information for Model Predictive Control with Semi-Markov Decision Processes

Boucher, Rémy Hosseinkhan, Semeraro, Onofrio, Mathelin, Lionel

arXiv.org Artificial Intelligence, Jan-28-2025

Recent works in Learning-Based Model Predictive Control of dynamical systems show impressive sample complexity performances using criteria from Information Theory to accelerate the learning procedure. However, the sequential exploration opportunities are limited by the system local state, restraining the amount of information of the observations from the current exploration trajectory. This article resolves this limitation by introducing temporal abstraction through the framework of Semi-Markov Decision Processes. The framework increases the total information of the gathered data for a fixed sampling budget, thus reducing the sample complexity.

artificial intelligence, machine learning, reinforcement learning, (13 more...)

arXiv.org Artificial Intelligence

2501.17256

Country:

North America > United States > Massachusetts (0.28)
North America > United States > California (0.28)

Genre: Research Report (0.82)

Industry: Energy > Oil & Gas > Upstream (0.71)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.71)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.49)


Model-based Reinforcement Learning for Semi-Markov Decision Processes with Neural ODEs

Neural Information Processing Systems, Oct-11-2024, 15:44:31 GMT

model-based reinforcement learning, neural ode, semi-markov decision process

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.69)


Reinforcement Learning in a Physics-Inspired Semi-Markov Environment

Bellinger, Colin, Coles, Rory, Crowley, Mark, Tamblyn, Isaac

arXiv.org Artificial Intelligence, Apr-15-2020

Reinforcement learning (RL) has been demonstrated to have great potential in many applications of scientific discovery and design. Recent work includes, for example, the design of new structures and compositions of molecules for therapeutic drugs. Much of the existing work related to the application of RL to scientific domains, however, assumes that the available state representation obeys the Markov property. For reasons associated with time, cost, sensor accuracy, and gaps in scientific knowledge, many scientific design and discovery problems do not satisfy the Markov property. Thus, something other than a Markov decision process (MDP) should be used to plan or find the optimal policy. In this paper, we present a physics-inspired semi-Markov RL environment, namely the phase change environment. In addition, we evaluate the performance of value-based RL algorithms for both MDPs and partially observable MDPs (POMDPs) on the proposed environment. Our results demonstrate that deep recurrent Q-networks (DRQNs) significantly outperform deep Q-networks (DQNs), and that DRQNs benefit from training with hindsight experience replay. Implications for the use of semi-Markovian RL and POMDPs for scientific laboratories are also discussed.

agent, change environment, phase change environment, (15 more...)

arXiv.org Artificial Intelligence

2004.07333

Country:

North America > Canada > Ontario > National Capital Region > Ottawa (0.14)
North America > United States > District of Columbia > Washington (0.04)
North America > Canada > Ontario > Waterloo Region > Waterloo (0.04)
(2 more...)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.76)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)
